Executed: Mon Mar 27 11:39:53 2017
Duration: 5 seconds.
where
$$Lk = I_{D_{ex}} \, \sigma_{D_{ex}}^D \, \phi_D \, \eta_{A_{det}}^{D_{em}} \, (1-E)$$$$Dir = I_{D_{ex}} \, \sigma_{D_{ex}}^A \, \phi_A \, \eta_{A_{det}}^{A_{em}}$$$$\gamma = \frac{\phi_A\,\eta_{D_{det}}^{A_{em}}}{\phi_D\,\eta_{D_{det}}^{D_{em}}}$$We previously fitted the leakage and gamma coefficient from the RAW PR values
for the 5 dsDNA measurements. We also fitted the
direct excitation coefficient expressed (dir_ex_aa
) as a function of the
A-signal during A-excitation (naa
). In symbols, dir_ex_aa
is defined as:
for a A-only population.
Alternatively, we can express the direct excitation contribution ($Dir$) as a function of the total corrected burst size:
$$ Dir = d_T\, (n_a + \gamma n_d)$$With this definition, expressing $d_T$ as a function of the physical parameters we obtain:
$$d_T = \frac{\sigma_{D_{ex}}^A}{\sigma_{D_{ex}}^D} $$where $\sigma_{Dex}^A$ and $\sigma_{Dex}^D$ are the absorption cross-sections of the Acceptor and Donor dye at wavelength of Donor laser.
Finally, remembering the definition of $\beta$:
$$ \beta = \frac{I_{A_{ex}}\sigma_{A_{ex}}^A}{I_{D_{ex}}\sigma_{D_{ex}}^D}$$we can express $d_T$ as the product of $\beta$ and $d_{AA}$:
$$ d_T = \beta \, d_{AA}$$Note that $d_T$ is a property of the Donor-Acceptor dyes pair and of the Donor excitation wavelength. As such, differently from $d_AA$, the $d_T$ coefficient is valid for the same sample in any setup using the same donor excitation wavelength, such as the single-spot μs-ALEX and the multi-spot system. Additionally, $d_T$ allows to correct for direct acceptor excitation using only donor-excitation quantities. Therefore the same correction formula can be used both in two-laser (e.g. single-spot μs-ALEX) and single-laser systems (e.g. 8-spot system).
We use two different procedures both yielding an estimation of $d_T$. Except for the numerical accuracy the two procedures are equivalent.
From the previous relation between $d_T = \beta \,d_{AA}$ is possible to directly estimate $d_T$ with the values of $\beta$ and $d_{AA}$ we already fitted in previous notebooks.
It is possible to go from the raw $E_R$ (only background correction, no leakage, direct excitation nor gamma) to the fully-corrected $E$ using the formula:
$$ E = f(E_R,\, \gamma,\, L_k,\, d_T) = \frac{E_{R} \left(L_{k} + d_T \gamma + 1\right) - L_{k} - d_T \gamma} {E_{R} \left(L_{k} - \gamma + 1\right) - L_{k} + \gamma}$$We can compute the corrected $E$ for the 5 dsDNA samples by fitting the fully-corrected histograms (histograms with γ, leakage and direct excitation corrections). We can also fit the 5 $E_R$ values for the same samples from the proximity ratio histograms (only background correction).
Therefore, using the previous formula we can fit $d_T$ (dir_ex_t
)
by minimizing the error between the 5 $E$ values fitted from
corrected histograms and the 5 $E$ values obtained correcting
the 5 $E_R$ values from the fit of the proximity ratio histograms.
In [1]:
import os
import numpy as np
import pandas as pd
import lmfit
from fretbursts import *
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
%config InlineBackend.figure_format='retina' # for hi-dpi displays
sns.set_style('whitegrid')
In [2]:
#bsearch_ph_sel = 'AND-gate'
bsearch_ph_sel = 'Dex'
data_file = 'results/usALEX-5samples-PR-raw-%s.csv' % bsearch_ph_sel
These are the RAW proximity ratios for the 5 samples (only background correction, no leakage nor direct excitation):
In [3]:
data_raw = pd.read_csv(data_file).set_index('sample')
data_raw[['E_gauss_w', 'E_kde_w', 'E_gauss_w_err', 'E_gauss_w_fiterr', 'n_bursts_all', 'n_bursts_fret']]
Out[3]:
Columns legend
E_gauss_w
, center of Gaussian peak, fitted on the weighted histograms.E_kde_w
, maximum of the KDE computed on the weighted histograms.E_gauss_w_err
is $\frac{\sigma}{\sqrt{n_{FRET}}}$, where $\sigma$ is the std.dev. of the Gaussian peak and $n_{FRET}$ is the number of bursts in the FRET population.E_gauss_w_fiterr
is the standard error returned by the non-linear least square fit. It is computed
from the Jacobian taking into account the degrees of freedom.n_bursts_all
, total number of burstsn_bursts_fret
, number of bursts in the FRET populationAnd these are the FRET efficiencies fitted from corrected histograms for the same 5 samples:
In [4]:
data_file = 'results/usALEX-5samples-E-corrected-all-ph.csv'
data_corr = pd.read_csv(data_file).set_index('sample')
data_corr[['E_gauss_w', 'E_kde_w', 'E_gauss_w_err', 'E_gauss_w_fiterr', 'n_bursts_all', 'n_bursts_fret']]
Out[4]:
Columns legend
E_gauss_w
, center of Gaussian peak, fitted on the weighted histograms.E_kde_w
, maximum of the KDE computed on the weighted histograms.E_gauss_w_err
is $\frac{\sigma}{\sqrt{n_{FRET}}}$, where $\sigma$ is the std.dev. of the Gaussian peak and $n_{FRET}$ is the number of bursts in the FRET population.E_gauss_w_fiterr
is the standard error returned by the non-linear least square fit. It is computed
from the Jacobian taking into account the degrees of freedom.n_bursts_all
, total number of burstsn_bursts_fret
, number of bursts in the FRET population
In [5]:
raw_data_file_sna = 'results/alix/us-ALEX SNA Results 2016-10-12.csv'
In [6]:
rsna = pd.read_csv(raw_data_file_sna, index_col=0)
rsna
Out[6]:
In [7]:
sna = rsna[['<E>', 'Mode']].round(4)
sna.columns = ['SNA_E_mean', 'SNA_E_max']
sna
Out[7]:
In [8]:
data_file_sna = 'results/usALEX-5samples-E-SNA.csv'
In [9]:
sna.to_csv(data_file_sna)
In [10]:
leakage_coeff_fname = 'results/usALEX - leakage coefficient DexDem.csv'
leakage = np.loadtxt(leakage_coeff_fname)
print('Leakage coefficient:', leakage)
In [11]:
dir_ex_coeff_fname = 'results/usALEX - direct excitation coefficient dir_ex_aa.csv'
dir_ex_aa = np.loadtxt(dir_ex_coeff_fname)
print('Dir. excitation AA:', dir_ex_aa)
In [12]:
dir_ex_t_datasheet_fname = 'results/Dyes - ATT0647N-ATTO550 abs X-section ratio at 532nm.csv'
dir_ex_t_datasheet = np.loadtxt(dir_ex_t_datasheet_fname)
print('Direct excitation (dir_ex_t) from datasheet:', dir_ex_t_datasheet)
In [13]:
gamma_coeff_fname = 'results/usALEX - gamma factor - all-ph.csv'
gamma = np.loadtxt(gamma_coeff_fname)
print('Gamma factor:', gamma)
In [14]:
beta_coeff_fname = 'results/usALEX - beta factor - all-ph.csv'
beta = np.loadtxt(beta_coeff_fname)
print('Beta factor:', beta)
Compute $d_T$ using $\beta$ and $d_{AA}$:
In [15]:
dir_ex_t_beta = dir_ex_aa * beta
'%.5f' % dir_ex_t_beta
Out[15]:
In [16]:
with open('results/usALEX - direct excitation coefficient dir_ex_t beta.csv', 'w') as f:
f.write('%.5f' % dir_ex_t_beta)
With this coefficient, computing the corrected $E$ for the 5 dsDNA samples we obtain:
In [17]:
PR_corr_kde = fretmath.correct_E_gamma_leak_dir(data_raw.E_kde_w,
leakage=leakage,
dir_ex_t=dir_ex_t_beta,
gamma=gamma)*100
PR_corr_kde
Out[17]:
In [18]:
PR_corr_gauss = fretmath.correct_E_gamma_leak_dir(data_raw.E_gauss_w,
leakage=leakage,
dir_ex_t=dir_ex_t_beta,
gamma=gamma)*100
PR_corr_gauss
Out[18]:
The coefficient $d_T$ can be estimated from data-sheet values of $\sigma_{D_{ex}}^A$ and $\sigma_{D_{ex}}^D$.
Using the datasheet values provided by ATTOTec (in PBS buffer) we obtain a $d_T$ estimation close to 10%:
In [19]:
dir_ex_t_datasheet
Out[19]:
With this the corrected $E$ for the 5 dsDNA samples are:
In [20]:
E_datasheet = fretmath.correct_E_gamma_leak_dir(data_raw.E_kde_w,
leakage=leakage,
dir_ex_t=dir_ex_t_datasheet,
gamma=gamma)*100
E_datasheet
Out[20]:
Comparing these values with the ones obtained fitting the corrected E histograms we observe a significant discrepancy:
In [21]:
out = data_corr[['E_kde_w']].copy()*100
out.columns = ['E_alex']
out['E_datasheet'] = E_datasheet
out
Out[21]:
In [22]:
out.plot(alpha=0.4, lw=3, style=dict(E_alex='-ob', E_datasheet='-sr'));
NOTE: The corrected FRET efficiencies using the datasheet and μs-ALEX-based direct excitation do not match well.
In [23]:
def residuals_absolute(params, E_raw, E_ref):
dir_ex_t = params['dir_ex_t'].value
return E_ref - fretmath.correct_E_gamma_leak_dir(E_raw,
leakage=leakage,
gamma=gamma,
dir_ex_t=dir_ex_t)
In [24]:
def residuals_relative(params, E_raw, E_ref):
dir_ex_t = params['dir_ex_t'].value
return (E_ref - fretmath.correct_E_gamma_leak_dir(E_raw,
leakage=leakage,
gamma=gamma,
dir_ex_t=dir_ex_t))/E_ref
In [25]:
params = lmfit.Parameters()
params.add('dir_ex_t', value=0.05)
In [26]:
m = lmfit.minimize(residuals_absolute, params, args=(data_raw.E_kde_w, data_corr.E_kde_w))
lmfit.report_fit(m.params, show_correl=False)
In [27]:
m = lmfit.minimize(residuals_relative, params, args=(data_raw.E_kde_w, data_corr.E_kde_w))
lmfit.report_fit(m.params, show_correl=False)
NOTE: The fitted
dir_ex_t
is 4.5% as opposed to 10.6% as expected from the absorption spectra of ATTO550 and ATTO647 at 532nm.
In [28]:
'%.5f' % m.params['dir_ex_t'].value
Out[28]:
In [29]:
with open('results/usALEX - direct excitation coefficient dir_ex_t fit.csv', 'w') as f:
f.write('%.5f' % m.params['dir_ex_t'].value)
In [30]:
PR_corr_kde_dfit = fretmath.correct_E_gamma_leak_dir(data_raw.E_kde_w,
leakage=leakage,
dir_ex_t=m.params['dir_ex_t'].value,
gamma=gamma)*100
PR_corr_kde_dfit.name = 'PR_corr_kde_dfit'
PR_corr_kde_dfit
Out[30]:
In [31]:
E = pd.concat([data_corr[['E_kde_w', 'E_gauss_w']]*100, PR_corr_kde, PR_corr_gauss, sna*100], axis=1)
E.columns = ['E KDE', 'E Gauss', 'PR KDE', 'PR Gauss', 'SNA E mean', 'SNA E max']
E
Out[31]:
In [32]:
E.plot.bar(table=np.round(E, 2).T)
plt.ylabel('FRET (%)')
plt.gca().xaxis.set_visible(False)
#plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.);
In [33]:
Eerr = (pd.DataFrame().assign(
E_gauss_w_err = data_corr.E_gauss_w_err, E_gauss_w_fiterr = data_corr.E_gauss_w_fiterr,
PR_gauss_w_err = data_raw.E_gauss_w_err, PR_gauss_w_fiterr = data_raw.E_gauss_w_fiterr,
SNA_SDV=rsna['SDV(E)']))*100
Eerr
Out[33]:
In [34]:
nbursts = data_corr[['n_bursts_all', 'n_bursts_fret']]
nbursts
Out[34]:
The standard error for the Gausian peak fit is reported both as the error returned from the fit routine (*gauss_w_fiterr
) and as $\frac{\sigma}{\sqrt{n_{FRET}}}$ (*gauss_w_err
, see Columns legend above for more details). The columns starting with E_
are from fit of the corrected FRET histograms.
The columns starting with PR_
are from fit of the PR histograms. In principle, the PR errors
should be propagated through the correction formula, but this prpagation is not reported here
(error is small).
The KDE method does not naturally provide a standard error. In principle the error can be computed using the bootstrap method, which is computationally intensive and is not reported here.
The SNA method returns a distribution of FRET efficiencies that, when shot-noise is added, best fits the experimental histogram. We report estimates of E using the max or the mean of the E distribution. We report the error as the standard deviation of the E distribution. Therefore this value is common to both E estimates (max or mean).
In [35]:
E[['PR KDE', 'PR Gauss', 'E KDE']].plot(kind='bar')
E[['PR KDE', 'PR Gauss', 'E KDE']].plot(lw=3);
print('Max error E_alex vs E_corr_pr: %.2f' % (E['E KDE'] - E['PR KDE']).abs().max())
print('Max error E_alex vs E_beta: %.2f' % (E['E KDE'] - E['PR Gauss']).abs().max())
print('Max error E_beta vs E_corr_pr: %.2f' % (E['PR Gauss'] - E['PR KDE']).abs().max())
In [36]:
x = [int(idx[:-1]) for idx in out.index]
plt.plot(x, 'E KDE', data=E)
plt.plot(x, 'PR KDE', data=E)
plt.plot(x, 'PR Gauss', data=E)
plt.xlabel('Distance in base-pair')
plt.ylabel('FRET');
In [37]:
E['E KDE'] - E['PR KDE']
Out[37]:
NOTE: Fitting $d_T$ to match $E$ from corrected histograms with $E$ from PR correction formula produces a max difference of 1% for the 12d sample. The match is well below the fitting accuracy (> 2%).
In [38]:
E.to_csv('results/usALEX-5samples-E-all-methods.csv', float_format='%.3f')
In [39]:
E.round(3)
Out[39]:
In [40]:
Eerr.to_csv('results/usALEX-5samples-E-all-methods_errors.csv', float_format='%.4f')